Abstract 



In a randomly oriented graph containing vertices $x$ and $y$, denote by
$\{x \to y\}$ the event that there is a directed path from $x$ to $y$. We study the
correlation between the events $\{x \to y\}$ and $\{y \to z\}$ for a (large) oriented
complete bipartite graph with orientation chosen uniformly at random.
We classify the cases of positive and negative correlation respectively in 
terms of the relative proportions of the sizes of the color classes of the 
graph. 
Path correlations in a randomly oriented
complete bipartite graph 

Erik Aas 

1 Introduction 

Let $G$ be an arbitrary finite graph whose edges $e$ have all been assigned
probabilities $p(e)$. We get a random graph by including the edge $e$ in the
graph with probability $p(e)$, independently of everything else. Let
$\{x \leftrightarrow y\}$ denote the event that there is a path connecting the two nodes
$x$ and $y$ in this random graph.

In [4] it was observed that when all $p(e) = 1/2$, the probability $P(x \leftrightarrow y)$
coincides with the probability of $\{x \to y\}$, the event of there being a
directed path from $x$ to $y$ in a uniformly chosen random orientation of
the edges of $G$ (so here we are comparing the probabilities of two different
events in two different probability spaces). Clearly, choosing an orientation
uniformly at random amounts to orienting each edge either way with equal
probability 1/2, independently of all other edges.
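
The coincidence of the two probabilities can be checked by brute force on a small graph. The following is only a minimal sketch and is not taken from [4]: the helper names (`reaches`, `prob_connected`, `prob_directed_path`, `complete_bipartite`) are illustrative assumptions, and full enumeration is feasible only for a handful of edges.

```python
from fractions import Fraction
from itertools import product


def reaches(adj, src, dst):
    """Depth-first search: is dst reachable from src along adjacency lists adj?"""
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        if u == dst:
            return True
        for w in adj.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False


def prob_connected(edges, x, y):
    """P(x <-> y) when every edge is kept independently with probability 1/2."""
    hits = 0
    for keep in product((False, True), repeat=len(edges)):
        adj = {}
        for (u, v), k in zip(edges, keep):
            if k:  # a kept edge can be traversed in both directions
                adj.setdefault(u, []).append(v)
                adj.setdefault(v, []).append(u)
        hits += reaches(adj, x, y)
    return Fraction(hits, 2 ** len(edges))


def prob_directed_path(edges, x, y):
    """P(x -> y) under a uniformly random orientation of the edges."""
    hits = 0
    for flip in product((False, True), repeat=len(edges)):
        adj = {}
        for (u, v), f in zip(edges, flip):
            tail, head = (v, u) if f else (u, v)
            adj.setdefault(tail, []).append(head)
        hits += reaches(adj, x, y)
    return Fraction(hits, 2 ** len(edges))


def complete_bipartite(m, n):
    """Edge list of K_{m,n} with vertices ("X", i) and ("Y", j)."""
    return [(("X", i), ("Y", j)) for i in range(m) for j in range(n)]
```

For instance, for $K_{2,2}$ and two vertices of the same colour class, both functions return 7/16.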

A natural question, then, is what other properties those two probability
spaces share.

It is a basic fact [3] that in the undirected random graph model defined
above, for any increasing events $A$ and $B$ (like $\{x \leftrightarrow z\}$ and $\{z \leftrightarrow y\}$),

\[ P(A \cap B) \ge P(A)P(B), \]

that is, $A$ and $B$ are positively (or, to be pedantic, nonnegatively) correlated.

An analogue of this fact, stated in [4], is that in any randomly directed
graph, the events $\{x \to y\}$ and $\{x \to z\}$ are positively correlated, that is,

\[ P(\{x \to y\} \cap \{x \to z\}) \ge P(x \to y)P(x \to z). \]

This motivates the following question: Let $G$ be a graph containing
three vertices $x, y, z$. Does

\[ P(x \to y \to z) \ge P(x \to y)P(y \to z) \]

hold, that is, are $\{x \to y\}$ and $\{y \to z\}$ positively correlated? Here, and
in the following, we write $\{x \to y \to z\} := \{x \to y\} \cap \{y \to z\}$.

Obviously this depends on the graph $G$, as it is easy to find graphs
for which $P(x \to y \to z) - P(x \to y)P(y \to z)$ has any given sign
(including 0), and a simple characterisation of all graphs with, say, positive
correlation seems hard to find. It is known [1] that for the complete graph
$K_n$, this quantity is negative for $n = 3$, zero for $n = 4$ and positive for
all $n > 4$. For a slightly different model, in [2] it is shown that when
the graph is 'dense', the analogous correlation is positive. Here we will
study the same correlation between the events $\{x \to y\}$ and $\{y \to z\}$ in a
uniformly chosen orientation of the edges of the complete bipartite graph
$K_{m,n}$.
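
For very small complete graphs the sign of $P(x \to y \to z) - P(x \to y)P(y \to z)$ can be checked by enumerating all orientations. The sketch below is an illustration only, not the method of [1]; the function names are assumptions made for this example, and the running time grows like $2^{\binom{n}{2}}$.

```python
from fractions import Fraction
from itertools import combinations, product


def reaches(out, src, dst):
    """Depth-first search along the out-neighbour lists in `out`."""
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        if u == dst:
            return True
        for w in out.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False


def path_covariance(edges, x, y, z):
    """P(x->y and y->z) - P(x->y) P(y->z) under a uniform random orientation."""
    n_xy = n_yz = n_both = 0
    for flip in product((False, True), repeat=len(edges)):
        out = {}
        for (u, v), f in zip(edges, flip):
            tail, head = (v, u) if f else (u, v)
            out.setdefault(tail, []).append(head)
        xy = reaches(out, x, y)
        yz = reaches(out, y, z)
        n_xy += xy
        n_yz += yz
        n_both += xy and yz
    total = 2 ** len(edges)
    return Fraction(n_both, total) - Fraction(n_xy, total) * Fraction(n_yz, total)


def complete_graph(n):
    """Edge list of K_n on the vertices 0, ..., n-1."""
    return list(combinations(range(n), 2))
```

Calling `path_covariance(complete_graph(n), 0, 1, 2)` for $n = 3, 4, 5$ reproduces the sign pattern quoted from [1]; for $n = 3$ the exact value is $-1/64$.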



2 Result 



Throughout the remainder of this note, let $A = \{x \to y\}$ and $B = \{y \to z\}$.
We denote the complement of a set (or an event) $A$ by $A^c$. We
define $\{x \not\to y\}$ to be $\{x \to y\}^c$, that is, there is no path from $x$ to
$y$. As a technical convenience the object of study will be

\[ RC_{m,n} = \frac{P(A^c \cap B^c) - P(A^c)P(B^c)}{P(A^c \cap B^c)}, \]

the relative covariance between $A^c$ and $B^c$, rather
than $P(A \cap B) - P(A)P(B)$. Observe that $P(A \cap B) - P(A)P(B) =
P(A^c \cap B^c) - P(A^c)P(B^c)$; this holds for any two events $A, B$. In
particular, $A$ and $B$ are positively correlated if and only if $RC_{m,n}$ is
positive. Observe that the relative covariance can be rewritten as (and
this might be the more convenient way of thinking about it)

\[ RC_{m,n} = 1 - \frac{P(A^c)P(B^c)}{P(A^c \cap B^c)} = 1 - \frac{P(B^c)}{P(B^c \mid A^c)}. \]

We fix some more notation: 

• The nodes of $K_{m,n}$ are partitioned into two sets $X$ and $Y$ of sizes
$m$ and $n$ respectively.

• $m = \lfloor \beta n \rfloor$ for some fixed positive constant $\beta$.

• $m, n > 2$.

• $a, b, c, d$ are four distinct vertices of $K_{m,n}$; the first three belong to
$X$ and $d$ belongs to $Y$.

• The limit $\lim_{n \to \infty} RC_{m,n}$ is denoted by $RC$.
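
Theorem 2.1. The value of $RC$ is given by the following table.

  Vertices in X | Vertices in Y | $\beta < 1$ | $\beta = 1$ | $\beta > 1$
  --------------+---------------+-------------+-------------+------------
  x, y, z       |               | $-1/3$      | $-1/3$      | $-1/3$
  x, y          | z             | $1/2$       | $1/5$       | $-1$
  x, z          | y             | $-\infty$   | $1/5$       | $0$

Table 1: The relative covariance between $\{x \not\to y\}$ and $\{y \not\to z\}$, according to
which partition the vertices belong to, and to the proportion $\beta$ of the number
of $X$-vertices to the number of $Y$-vertices.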



We see that letting $m = \lfloor \beta n \rfloor$ for a fixed constant $\beta$ is not as restrictive
as might seem at first thought.



3 Proof 

The proof of Theorem 2.1 will follow from a number of lemmas estimating
the probabilities $P(A^c)$, $P(B^c)$, $P(A^c \cap B^c)$ in terms of $n$. A common
feature of these estimates is that the lower bounds, which are trivial to
obtain, are close to the harder-to-prove upper bounds. If $S$ and $T$ are two
disjoint sets of vertices in $K_{m,n}$, an '$ST$-witness' is defined to be a vertex
$u$ for which there is at least one edge from $S$ to $u$ and at least one edge
from $u$ to $T$.

For a vertex $s$ in $K_{m,n}$, the set $O_s$ will loosely be defined as the set of
vertices in $X' \cup Y'$ (thus in $X'$ if $s \in Y$ and in $Y'$ if $s \in X$) which can be
reached in exactly one step from $s$, $X'$ and $Y'$ being defined separately
in each section where this notation is used. We denote $|X'|$ and $|Y'|$ by
$m'$ and $n'$. The set $I_s$ of vertices which reach $s$ in exactly one step is
similarly defined.

Estimating these probabilities will be a lot of repetitive work. The
following inequality will be used several times: if $s, t \ge \alpha$, then
$st \ge \alpha s + \alpha t - \alpha^2$ (this is $(s - \alpha)(t - \alpha) \ge 0$ rearranged).

Below, when summing over subsets of nodes denoted by upper case
letters, the sizes of these sets will often be denoted by the corresponding
lower case letters.

To estimate sums of the form $\sum_{s=0}^{n} \sum_{t=0}^{m} \binom{n}{s} \binom{m}{t} (\tfrac12)^{st}$, we will split
them into several parts according to whether $s \ge \alpha$ or $t \ge \alpha$ for some
suitably chosen constant $\alpha$ (depending only on $\beta$).

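(i) $P(b \not\to a)$

Lemma 3.1. $P(b \not\to a) \sim 2(\tfrac12)^{n}$.

Proof. Let $X' = X - \{a, b\}$, $Y' = Y$.

A lower bound is given by

\[ P(b \not\to a) \ge P(\{\text{there is no edge directed away from } b\} \cup \{\text{there is no edge directed towards } a\}) = 2(\tfrac12)^{n} - (\tfrac12)^{2n}, \]

by inclusion-exclusion.

By calculating the probability that there is no path from $b$ to $a$ of length
at most 4, we get the following upper bound:

\[ P(b \not\to a) = \sum_{S,T \subseteq Y} P(b \not\to a \mid O_b = S, I_a = T)\, P(O_b = S, I_a = T) \le (\tfrac12)^{2n} \sum_{S,T \subseteq Y:\, S \cap T = \emptyset} P(\text{no } x \in X' \text{ is an } ST\text{-witness}) \]
\[ = (\tfrac12)^{2n} \sum_{s=0}^{n} \sum_{t=0}^{n-s} \binom{n}{s} \binom{n-s}{t} \left( (\tfrac12)^{s} + (\tfrac12)^{t} - (\tfrac12)^{s+t} \right)^{m-2}. \]

Note that the partial sum corresponding to $st = 0$ is equal to the
lower bound. We now show that the other terms sum to $o((\tfrac12)^{n})$. Split
the remaining sum into the following four parts: $S_1$: $s, t \ge \alpha$; $S_2$: $1 \le s < \alpha \le t$;
$S_3$: $1 \le t < \alpha \le s$; $S_4$: $1 \le s, t < \alpha$.

Note that in the $S_1$ case, $(\tfrac12)^{s} + (\tfrac12)^{t} - (\tfrac12)^{s+t} < (\tfrac12)^{\alpha-1}$. Hence

\[ S_1 \le (\tfrac12)^{2n} \sum_{s=\alpha}^{n} \binom{n}{s} \sum_{t=\alpha}^{n-s} \binom{n-s}{t} (\tfrac12)^{(\alpha-1)(m-2)} \le (\tfrac12)^{2n + (\alpha-1)(m-2)}\, 3^{n} = o\left((\tfrac12)^{(\alpha-1)(m-2)}\right) = o((\tfrac12)^{n}), \]

the last equality holding when choosing $\alpha$ large enough.

\[ S_2 \le (\tfrac12)^{2n} \sum_{s=1}^{\alpha} \binom{n}{s} \sum_{t=\alpha}^{n-s} \binom{n-s}{t} \left( (\tfrac12)^{s} + (\tfrac12)^{t} \right)^{m-2} \le (\tfrac12)^{2n} \sum_{s=1}^{\alpha} n^{\alpha} \sum_{t=0}^{n} \binom{n}{t} \left( \tfrac12 + (\tfrac12)^{\alpha} \right)^{m-2} \le \alpha n^{\alpha} (\tfrac12)^{n} \left( \tfrac12 + (\tfrac12)^{\alpha} \right)^{m-2} = o((\tfrac12)^{n}) \]

if $\alpha > 1$.

By symmetry, we may choose $\alpha$ possibly even larger so that $S_3 = o((\tfrac12)^{n})$ holds.

Clearly, $S_4 = o((\tfrac12)^{n})$.

Hence $P(b \not\to a) - 2(\tfrac12)^{n} \le S_1 + S_2 + S_3 + S_4 = o((\tfrac12)^{n})$. □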
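
To see how close the two bounds in the proof of Lemma 3.1 are for concrete sizes, they can be evaluated exactly. The sketch below assumes the bounds in the form $2(\tfrac12)^{n} - (\tfrac12)^{2n}$ (lower) and $(\tfrac12)^{2n} \sum_{s,t} \binom{n}{s}\binom{n-s}{t} ((\tfrac12)^{s} + (\tfrac12)^{t} - (\tfrac12)^{s+t})^{m-2}$ (upper); the function name `lemma31_bounds` is hypothetical.

```python
from fractions import Fraction
from math import comb


def lemma31_bounds(m, n):
    """Return (lower, upper) bounds on P(b does not reach a) in K_{m,n}.

    lower: no edge leaves b, or no edge enters a (inclusion-exclusion).
    upper: no path from b to a of length at most 4, summed over the sizes
           s = |O_b| and t = |I_a| of two disjoint subsets of Y.
    """
    half = Fraction(1, 2)
    lower = 2 * half**n - half ** (2 * n)
    total = Fraction(0)
    for s in range(n + 1):
        for t in range(n - s + 1):
            # probability that a fixed vertex of X' is not an ST-witness
            no_witness = half**s + half**t - half ** (s + t)
            total += comb(n, s) * comb(n - s, t) * no_witness ** (m - 2)
    upper = half ** (2 * n) * total
    return lower, upper
```

For $m = 2$, $n = 1$ both bounds equal 3/4, which is the exact probability in that case; for $m = n = 20$ the relative gap between the two bounds is already well below $10^{-3}$.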
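(ii) $P(d \not\to a)$

Lemma 3.2. $P(d \not\to a) \sim (\tfrac12)^{m} + (\tfrac12)^{n}$. (This is $\sim (\tfrac12)^{m}$ for $\beta < 1$,
$\sim 2(\tfrac12)^{n}$ for $\beta = 1$, and $\sim (\tfrac12)^{n}$ for $\beta > 1$.)

Proof. Let $X' = X - \{a\}$, $Y' = Y - \{d\}$. The probability is bounded
from below by

\[ P(d \not\to a) \ge P(\text{no edge leaves } d \text{ or no edge enters } a) \ge (\tfrac12)^{m} + (\tfrac12)^{n} - (\tfrac12)^{m+n-1}. \]

For the upper bound, we calculate the probability that there is no path
from $d$ to $a$ of length at most 3:

\[ P(d \not\to a) = \sum_{S \subseteq Y',\, T \subseteq X'} P(d \not\to a \mid I_a = S, O_d = T)\, P(I_a = S, O_d = T) \le (\tfrac12)^{m+n-1} \sum_{s=0}^{n-1} \sum_{t=0}^{m-1} \binom{n-1}{s} \binom{m-1}{t} P(\text{no edge from } T \text{ to } S) \]
\[ = (\tfrac12)^{m+n-1} \sum_{s=0}^{n-1} \sum_{t=0}^{m-1} \binom{n-1}{s} \binom{m-1}{t} (\tfrac12)^{st}. \]

The partial sum with $st = 0$ equals the lower bound. We now show that the
remaining terms sum to $o((\tfrac12)^{m} + (\tfrac12)^{n})$ by splitting their sum into the
following cases: $S_1$: $s, t \ge \alpha$; $S_2$: $1 \le s < \alpha$; and $S_3$: $1 \le t < \alpha$.

Using $s, t \ge \alpha \Rightarrow st \ge \alpha s + \alpha t - \alpha^2$,

\[ S_1 = (\tfrac12)^{m+n-1} \sum_{s=\alpha}^{n-1} \binom{n-1}{s} \sum_{t=\alpha}^{m-1} \binom{m-1}{t} (\tfrac12)^{st} \le (\tfrac12)^{m+n-1} \sum_{s=\alpha}^{n-1} \binom{n-1}{s} \sum_{t=\alpha}^{m-1} \binom{m-1}{t} (\tfrac12)^{\alpha s + \alpha t - \alpha^2} \]
\[ \le (\tfrac12)^{m+n-1}\, 2^{\alpha^2} \sum_{s=0}^{n-1} \binom{n-1}{s} (\tfrac12)^{\alpha s} \sum_{t=0}^{m-1} \binom{m-1}{t} (\tfrac12)^{\alpha t} = (\tfrac12)^{m+n-1}\, 2^{\alpha^2} \left( 1 + (\tfrac12)^{\alpha} \right)^{m+n-2} = o((\tfrac12)^{m} + (\tfrac12)^{n}), \]

choosing $\alpha$ large enough. Similarly,

\[ S_2 = (\tfrac12)^{m+n-1} \sum_{s=1}^{\alpha-1} \binom{n-1}{s} \sum_{t=0}^{m-1} \binom{m-1}{t} (\tfrac12)^{st} \le (\tfrac12)^{m+n-1} (n-1)^{\alpha} \alpha \sum_{t=0}^{m-1} \binom{m-1}{t} (\tfrac12)^{t} = \alpha (n-1)^{\alpha} (\tfrac12)^{m+n-1} (\tfrac32)^{m-1} = o((\tfrac12)^{m} + (\tfrac12)^{n}). \]

A similar argument shows $S_3 = o((\tfrac12)^{m} + (\tfrac12)^{n})$.

Hence $P(d \not\to a) - ((\tfrac12)^{m} + (\tfrac12)^{n}) \le S_1 + S_2 + S_3 = o((\tfrac12)^{m} + (\tfrac12)^{n})$. □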

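(iii) $P(b \not\to d \not\to a)$

Lemma 3.3. $P(b \not\to d \not\to a) \sim 2(\tfrac12)^{m+n-1} + (\tfrac12)^{2n}$.

Proof. Let $X' = X - \{a, b\}$, $Y' = Y - \{d\}$. For the lower bound, we calculate
the probability

\[ P\left( \{\text{the edge between } b \text{ and } d, \text{ and the edge between } d \text{ and } a, \text{ form a directed path from } a \text{ to } b\} \cap \left( \{O_b = O_d = \emptyset\} \cup \{O_b = I_a = \emptyset\} \cup \{I_d = I_a = \emptyset\} \right) \right), \]

which is $2(\tfrac12)^{m+n-1} + (\tfrac12)^{2n} - (\tfrac12)^{m+2n-3}$, by
inclusion-exclusion; $P(b \not\to d \not\to a) \ge 2(\tfrac12)^{m+n-1} + (\tfrac12)^{2n} - (\tfrac12)^{m+2n-3}$.

To get a working upper bound, it is sufficient to calculate the probability
of there being no path from $b$ to $d$ or from $d$ to $a$, either of length at
most 3. Conditioning on $I_a = S$, $O_b = T$, $I_d = U$, $O_d = V$, there may be
no edge from $T$ to $U$, nor from $V$ to $S$. The edges $\{b, d\}$ and $\{a, d\}$ form
a directed path from $a$ to $b$. This implies that $S$ and $T$ must be disjoint.

Hence

\[ P(b \not\to d \not\to a) = \sum_{S,T,U,V} P(b \not\to d \not\to a \mid I_a = S, O_b = T, I_d = U, O_d = V)\, P(I_a = S, O_b = T, I_d = U, O_d = V) \]
\[ \le (\tfrac12)^{m+2n-2} \sum_{s=0}^{n-1} \binom{n-1}{s} \sum_{t=0}^{n-1-s} \binom{n-1-s}{t} \sum_{u=0}^{m-2} \binom{m-2}{u} (\tfrac12)^{tu + s(m-2-u)}. \]

The sum of the terms for which $s = t = 0$, $s = u = 0$ or $t = u - (m-2) = 0$
equals the lower bound. The remaining sum is split into the
following cases: $S_1$: $s, t \ge \alpha$; $S_2$: $1 \le s < \alpha \le t$; $S_3$: $1 \le t < \alpha \le s$; and
$S_4$: $1 \le s, t < \alpha$.

\[ S_1 \le (\tfrac12)^{2n+m-2} \sum_{s=\alpha}^{n-1} \binom{n-1}{s} \sum_{t=\alpha}^{n-1-s} \binom{n-1-s}{t} \sum_{u=0}^{m-2} \binom{m-2}{u} (\tfrac12)^{\alpha(m-2)} \le (\tfrac12)^{2n+m-2}\, 3^{n-1}\, (\tfrac12)^{(\alpha-1)(m-2)} = o\left( (\tfrac12)^{m+n-2} + (\tfrac12)^{2n} \right). \]

\[ S_2 \le (\tfrac12)^{2n+m-2} \sum_{s=1}^{\alpha} \binom{n-1}{s} \sum_{t=\alpha}^{n-1-s} \binom{n-1-s}{t} \sum_{u=0}^{m-2} \binom{m-2}{u} (\tfrac12)^{tu + s(m-2-u)} \le (\tfrac12)^{2n+m-2} (n-1)^{\alpha} \alpha\, 2^{n-1} \left( \tfrac12 + (\tfrac12)^{\alpha} \right)^{m-2} = o\left( (\tfrac12)^{m+n-2} + (\tfrac12)^{2n} \right). \]

By symmetry with $S_2$, we deduce $S_3 = o((\tfrac12)^{m+n-2} + (\tfrac12)^{2n})$.
Finally, $S_4 \le \alpha^2 n^{2\alpha} (\tfrac12)^{2n+m-2} = o((\tfrac12)^{m+n-2} + (\tfrac12)^{2n})$.

We conclude that

\[ P(b \not\to d \not\to a) \sim 2(\tfrac12)^{m+n-1} + (\tfrac12)^{2n}. \]

□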



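(iv) $P(c \not\to b \not\to a)$

Lemma 3.4. $P(c \not\to b \not\to a) \sim 3(\tfrac12)^{2n}$.

Proof. Here $X' = X - \{a, b, c\}$ and $Y' = Y$. For the lower bound, note that

\[ P(c \not\to b \not\to a) \ge P(\{O_c = O_b = \emptyset\} \cup \{O_c = I_a = \emptyset\} \cup \{I_b = I_a = \emptyset\}) \ge 3(\tfrac12)^{2n} - 2(\tfrac12)^{3n}. \]

For the upper bound, we will sum over $U \subseteq Y'$, $V = Y' - U$, $S \subseteq U$,
and $T \subseteq V$. When doing so, an expression for the probability that a
given vertex $x' \in X'$ is not a $TU$-witness and not a $VS$-witness is needed.
The probability of the complementary event is the probability of $x'$ being
a $TU$-witness or a $VS$-witness. The separate probabilities for these last
two events are $P(x' \text{ is a } TU\text{-witness}) = (1 - (\tfrac12)^{|T|})(1 - (\tfrac12)^{|U|})$ and
$P(x' \text{ is a } VS\text{-witness}) = (1 - (\tfrac12)^{|V|})(1 - (\tfrac12)^{|S|})$. The probability of
their intersection is $(1 - (\tfrac12)^{|T|})(1 - (\tfrac12)^{|S|})$, using $S \subseteq U$, $T \subseteq V$. By
inclusion-exclusion, we get

\[ P(x' \in X' \text{ is not a } TU\text{-witness, nor a } VS\text{-witness}) = 1 - (1 - (\tfrac12)^{|T|})(1 - (\tfrac12)^{|U|}) - (1 - (\tfrac12)^{|V|})(1 - (\tfrac12)^{|S|}) + (1 - (\tfrac12)^{|T|})(1 - (\tfrac12)^{|S|}), \]

which simplifies to $(\tfrac12)^{|U|} + (\tfrac12)^{|V|} - (\tfrac12)^{|T|+|U|} - (\tfrac12)^{|V|+|S|} + (\tfrac12)^{|S|+|T|}$. Hence

\[ P(c \not\to b \not\to a) = \sum_{S,T,U,V} P(c \not\to b \not\to a \mid I_a = S, O_c = T, I_b = U, O_b = V)\, P(I_a = S, O_c = T, I_b = U, O_b = V) \]
\[ \le (\tfrac12)^{3n} \sum_{S,T,U,V} P(\text{no } x \in X' \text{ is a } TU\text{-witness, nor a } VS\text{-witness}) \]
\[ = (\tfrac12)^{3n} \sum_{u+v=n} \binom{n}{u} \sum_{s=0}^{u} \binom{u}{s} \sum_{t=0}^{v} \binom{v}{t} \left( (\tfrac12)^{u} + (\tfrac12)^{v} - (\tfrac12)^{t+u} - (\tfrac12)^{v+s} + (\tfrac12)^{s+t} \right)^{m-3}. \]

The sum of the terms with $s = t = 0$ or $u = 0$ or $v = 0$ equals the lower
bound. The other terms sum to $o((\tfrac12)^{2n})$, as we now turn to show. Since
$(u, t)$ and $(v, s)$ are interchangeable, we need only consider the following
cases: $S_1$: $s, t \ge \alpha$; $S_2$: $1 \le s < \alpha \le t, u$; $S_3$: $1 \le s < \alpha \le t$, $1 \le u < \alpha$;
$S_4$: $1 \le s, t < \alpha$.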



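In the $S_1$ case, $(\tfrac12)^{u} + (\tfrac12)^{v} + (\tfrac12)^{s+t} \le 3(\tfrac12)^{\alpha} < (\tfrac12)^{\alpha-2}$, so

\[ S_1 \le (\tfrac12)^{3n} \sum_{u \ge \alpha,\; u+v=n} \binom{n}{u} \sum_{s=\alpha}^{u} \binom{u}{s} \sum_{t=\alpha}^{v} \binom{v}{t} (\tfrac12)^{(\alpha-2)(m-3)} \le (\tfrac12)^{3n}\, 2^{2n}\, (\tfrac12)^{(\alpha-2)(m-3)} = O\left( (\tfrac12)^{(\beta(\alpha-2)+1)n} \right), \]

which is $o((\tfrac12)^{2n})$ when choosing $\alpha$ large enough (e.g. $\alpha > 2(1 + 1/\beta)$).
Note that $t \ge \alpha \Rightarrow v \ge \alpha$.

\[ S_2 \le (\tfrac12)^{3n} \sum_{u=\alpha}^{n} \binom{n}{u} \sum_{s=1}^{\alpha} n^{\alpha} \sum_{t=\alpha}^{v} \binom{v}{t} \left( 3(\tfrac12)^{\alpha} \right)^{m-3} = o((\tfrac12)^{2n}) \]

for large enough $\alpha$.

\[ S_3 \le (\tfrac12)^{3n} \sum_{u=1}^{\alpha} \sum_{s=1}^{\alpha} \sum_{t=\alpha}^{n} \binom{n}{u} \binom{u}{s} \binom{v}{t} \left( (\tfrac12)^{\alpha} + \tfrac12 + (\tfrac12)^{n-\alpha} \right)^{m-3} = o((\tfrac12)^{2n}) \]

for large enough $\alpha$.

\[ S_4 \le (\tfrac12)^{3n}\, \alpha^2 n^{2\alpha}\, 2^{n}\, (\tfrac34)^{m-3} = o((\tfrac12)^{2n}). \]

Consequently $P(c \not\to b \not\to a) - 3(\tfrac12)^{2n} = o((\tfrac12)^{2n})$.

□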
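
The inclusion-exclusion step in the proof of Lemma 3.4 concerns a single vertex $x'$ and only finitely many edges, so the simplified expression $(\tfrac12)^{|U|} + (\tfrac12)^{|V|} - (\tfrac12)^{|T|+|U|} - (\tfrac12)^{|V|+|S|} + (\tfrac12)^{|S|+|T|}$ can be compared with a direct enumeration. The following is an illustrative sketch under the assumptions $S \subseteq U$, $T \subseteq V$ and $U \cap V = \emptyset$; the function names are not from the paper.

```python
from fractions import Fraction
from itertools import product


def no_witness_formula(s, t, u, v):
    """Closed form for P(x' is neither a TU-witness nor a VS-witness)."""
    half = Fraction(1, 2)
    return half**u + half**v - half ** (t + u) - half ** (v + s) + half ** (s + t)


def no_witness_brute_force(s, t, u, v):
    """Enumerate all orientations of the u + v edges between x' and Y'.

    Vertices 0..u-1 form U (the first s of them form S) and vertices
    u..u+v-1 form V (the first t of them form T).  into[i] is True when
    the edge at vertex i points into x'.
    """
    assert s <= u and t <= v
    U, V = range(u), range(u, u + v)
    S, T = range(s), range(u, u + t)
    good = 0
    for into in product((False, True), repeat=u + v):
        # PQ-witness: some edge from P into x' and some edge from x' into Q
        tu = any(into[i] for i in T) and any(not into[i] for i in U)
        vs = any(into[i] for i in V) and any(not into[i] for i in S)
        good += not (tu or vs)
    return Fraction(good, 2 ** (u + v))
```

For example, with $|U| = 2$ and $|S| = |T| = |V| = 1$ both functions give 5/8.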



(v) $P(d \not\to b \not\to a)$

Lemma 3.5. $P(d \not\to b \not\to a) \sim (\tfrac12)^{m+n-2} + (\tfrac12)^{2n}$.

Proof. Let $X' = X - \{a, b\}$, $Y' = Y - \{d\}$.

As before, we have the simple lower bound

\[ P(d \not\to b \not\to a) \ge P\left( \{\text{the edges } \{a, d\} \text{ and } \{b, d\} \text{ are both directed towards } d\} \cap \left( \{O_d = O_b = \emptyset\} \cup \{O_d = I_a = \emptyset\} \cup \{I_b = I_a = \emptyset\} \right) \right) \ge (\tfrac12)^{m+n-2} + (\tfrac12)^{2n} - (\tfrac12)^{m+2n-3}. \]

We bound the probability from above by the probability of there being
no path from $d$ to $b$ or from $b$ to $a$ of length at most 3 or 4 respectively.
The edges $\{a, d\}$ and $\{b, d\}$ are both directed towards $d$. Condition on
$O_d = T$, $I_b = U$, $I_a = S$, $O_b = V$. No edge is directed from $T$ to $U$,
and $S \subseteq U$. These conditions imply that no $x \in T$ is a $VU$-witness (and
hence, as $S \subseteq U$, not a $VS$-witness). In
addition we forbid any $x \in X' - T$ to be a $VS$-witness. The events '$x$ is a
$VS$-witness' are independent for $x \in X' - T$ and independent of the other
necessary events just stated. We obtain



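\[ P(d \not\to b \not\to a) = \sum_{S,T,U,V} P(d \not\to b \not\to a \mid O_d = T, I_b = U, O_b = V, I_a = S)\, P(O_d = T, I_b = U, O_b = V, I_a = S) \]
\[ \le (\tfrac12)^{2n+m-2} \sum_{t=0}^{m-2} \binom{m-2}{t} \sum_{u=0}^{n-1} \binom{n-1}{u} \sum_{s=0}^{u} \binom{u}{s} (\tfrac12)^{tu} \left( (\tfrac12)^{v} + (\tfrac12)^{s} - (\tfrac12)^{v+s} \right)^{m-2-t} \]
\[ = (\tfrac12)^{2n+m-2} \sum_{u=0}^{n-1} \binom{n-1}{u} \sum_{s=0}^{u} \binom{u}{s} \left( (\tfrac12)^{u} + (\tfrac12)^{v} + (\tfrac12)^{s} - (\tfrac12)^{v+s} \right)^{m-2}, \]

where $v = n - 1 - u$.

For $s = 0$ we obtain the following sum:

\[ (\tfrac12)^{2n+m-2} \sum_{u=0}^{n-1} \binom{n-1}{u} \left( 1 + (\tfrac12)^{u} \right)^{m-2} = (\tfrac12)^{2n} + (\tfrac12)^{m+n-1} + o\left( (\tfrac12)^{m+n-2} + (\tfrac12)^{2n} \right). \]

For $v = 0$:

\[ (\tfrac12)^{2n+m-2} \sum_{s=0}^{n-1} \binom{n-1}{s} \left( 1 + (\tfrac12)^{n-1} \right)^{m-2} \sim (\tfrac12)^{m+n-1}, \]

as $\lim_{n \to \infty} \left( 1 + (\tfrac12)^{n-1} \right)^{m-2} = 1$.

Split the remaining sum into the following cases: $S_1$: $1 \le u < \alpha$; $S_2$:
$1 \le s < \alpha \le u, v$; $S_3$: $\alpha \le s, u, v$; $S_4$: $1 \le s, v < \alpha$; $S_5$: $1 \le v < \alpha \le s$.

Clearly, $S_1 = o((\tfrac12)^{m+n-2} + (\tfrac12)^{2n})$.

\[ S_2 \le (\tfrac12)^{m+2n-2} \sum_{u=\alpha}^{n-1-\alpha} \binom{n-1}{u} \sum_{s=1}^{\alpha} \binom{u}{s} \left( (\tfrac12)^{s} + 2(\tfrac12)^{\alpha} \right)^{m-2} \le \alpha n^{\alpha} (\tfrac12)^{m+n-1} \left( \tfrac12 + 2(\tfrac12)^{\alpha} \right)^{m-2} = o\left( (\tfrac12)^{m+n-2} + (\tfrac12)^{2n} \right) \]

for $\alpha > 2$, which is easily seen by considering $\beta \le 1$ and $\beta > 1$
separately.

\[ S_3 \le (\tfrac12)^{m+2n-2} \sum_{u=\alpha}^{n-1} \binom{n-1}{u} \sum_{s=\alpha}^{u} \binom{u}{s} \left( 3(\tfrac12)^{\alpha} \right)^{m-2} \le (\tfrac12)^{m+2n-2}\, 3^{n-1} \left( 3(\tfrac12)^{\alpha} \right)^{m-2} = o\left( (\tfrac12)^{m+n-2} + (\tfrac12)^{2n} \right) \]

for $\alpha$ large enough.

$S_4 = o((\tfrac12)^{m+n-2} + (\tfrac12)^{2n})$, since $S_4$ is the sum of a constant ($\alpha^2$)
number of $o((\tfrac12)^{m+n-2} + (\tfrac12)^{2n})$ terms.

\[ S_5 \le \alpha n^{\alpha} (\tfrac12)^{m+n-1} \left( \tfrac12 + (\tfrac12)^{\alpha} + (\tfrac12)^{n-1-\alpha} \right)^{m-2} = o\left( (\tfrac12)^{m+n-2} + (\tfrac12)^{2n} \right) \]

for $\alpha > 2$.

Finally,

\[ P(d \not\to b \not\to a) \sim (\tfrac12)^{m+n-2} + (\tfrac12)^{2n}. \]

□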



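We now show how to use the lemmas above to prove Theorem 2.1. For
example, suppose $\beta < 1$, $x, y \in X$, and $z \in Y$. Then

\[ RC_{m,n} = 1 - \frac{P(x \not\to y)P(y \not\to z)}{P(x \not\to y \not\to z)} = 1 - \frac{P(a \not\to b)P(b \not\to d)}{P(a \not\to b \not\to d)} = 1 - \frac{P(b \not\to a)P(d \not\to a)}{P(d \not\to b \not\to a)} \sim 1 - \frac{2(\tfrac12)^{n} \left( (\tfrac12)^{m} + (\tfrac12)^{n} \right)}{(\tfrac12)^{m+n-2} + (\tfrac12)^{2n}} \]

(by Lemmas 3.1, 3.2 and 3.5). Now, since $\beta < 1$ we get

\[ \lim_{n \to \infty} RC_{m,n} = \lim_{n \to \infty} \left( 1 - \frac{2 \left( 1 + (\tfrac12)^{n-m} \right)}{4 + (\tfrac12)^{n-m}} \right) = \frac12. \]

The other entries in the table in the statement of the theorem can be
found similarly.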
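
The same computation can be carried out numerically. The sketch below plugs in only the leading-order estimates, assumed here in the form $P(b \not\to a) \sim 2(\tfrac12)^{n}$, $P(d \not\to a) \sim (\tfrac12)^{m} + (\tfrac12)^{n}$, $P(c \not\to b \not\to a) \sim 3(\tfrac12)^{2n}$ and $P(d \not\to b \not\to a) \sim (\tfrac12)^{m+n-2} + (\tfrac12)^{2n}$; it therefore approximates the limit $RC$ rather than the exact $RC_{m,n}$. The function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def rc_all_in_X(m, n):
    """x, y, z in X: leading terms of P(b-/->a)^2 and P(c-/->b-/->a).

    The leading terms do not depend on m in this case.
    """
    p_single = 2 * HALF**n
    p_joint = 3 * HALF ** (2 * n)
    return 1 - p_single * p_single / p_joint


def rc_z_in_Y(m, n):
    """x, y in X and z in Y: leading terms of the three probabilities."""
    p_ba = 2 * HALF**n                            # P(b -/-> a)
    p_da = HALF**m + HALF**n                      # P(d -/-> a)
    p_dba = HALF ** (m + n - 2) + HALF ** (2 * n)  # P(d -/-> b -/-> a)
    return 1 - p_ba * p_da / p_dba


if __name__ == "__main__":
    n = 200
    for beta in (Fraction(1, 2), Fraction(1), Fraction(2)):
        m = int(beta * n)  # m = floor(beta * n)
        print(beta, float(rc_all_in_X(m, n)), float(rc_z_in_Y(m, n)))
```

With $n = 200$ the second function is numerically indistinguishable from $1/2$, $1/5$ and $-1$ for $\beta = 1/2, 1, 2$ respectively, and the first returns $-1/3$ for every $\beta$.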



4 Further questions 



I would like to mention two questions:

Question 1:

How do the results above change if an edge $e = \{x, y\}$, where $x \in X$
and $y \in Y$, rather than being oriented either way with equal probability,
is directed from $x$ to $y$ with some fixed probability $p$?

Question 2:

From Theorem 2.1, it seems plausible that $\{x \not\to y\}$ and $\{y \not\to z\}$ should
be negatively correlated in any complete bipartite graph (at least in any
large enough graph) when $x, y, z$ belong to the same color class. This
seems not to be the case, however; computer calculations show that if $n$
is much larger than $m$ (on the scale $n = 2^{m}$), then the events mentioned
seem to be positively correlated even for fairly large $n$. Is this true in
general? Can the cases with positive correlation be completely identified
in terms of $m$ and $n$?

The calculations mentioned above made use of the following recursions:

\[ f_X(m,n,k) = \sum_{l=0}^{n} \binom{n}{l} \frac{(2^{k}-1)^{l}}{2^{kn}}\, f_Y(m-k, n, l), \]
\[ f_Y(m,n,l) = \sum_{k=0}^{m-1} \binom{m-1}{k} \frac{(2^{l}-1)^{k}}{2^{lm}}\, f_X(m, n-l, k), \]
\[ g_X(m,n,k) = \sum_{l=0}^{n} \binom{n}{l} \frac{(2^{k}-1)^{l}}{2^{kn}}\, g_Y(m-k, n, l), \]
\[ g_Y(m,n,l) = \sum_{k=0}^{m-2} \binom{m-2}{k} \frac{(2^{l}-1)^{k}}{2^{lm}}\, g_X(m, n-l, k), \]

together with two analogous recursions expressing $h_X(m,n,k)$ as a weighted sum
of the values $h_Y(m-k, n, l)$, and $h_Y(m,n,l)$ as a weighted sum of the values
$h_X(m, n-l, k)$, where $f_X$, $f_Y$, $g_X$, $g_Y$, $h_X$ and $h_Y$ are defined as follows.

Let $P_{m,n}$ denote the probability measure associated with a uniformly
chosen orientation of $K_{m,n}$, where the class $X$ has size $m$ and the class $Y$
has size $n$. For a subset $K$ of vertices of $K_{m,n}$ and a vertex $x$ in $K_{m,n}$, let
$\{K \not\to x\} = \bigcap_{k \in K} \{k \not\to x\}$. Let $a, b, c \in X$, $d \in Y$, and let $K$ be any subset of
$X$, not including $a$ or $b$, of size $k$ and $L$ be any subset of $Y$, not including
$d$, of size $l$.

Then $f_X(m,n,k) = P_{m,n}(K \not\to a)$, $f_Y(m,n,l) = P_{m,n}(L \not\to a)$,
$g_X(m,n,k) = P_{m,n}(K \not\to b \text{ and } b \not\to a)$, $g_Y(m,n,l) = P_{m,n}(L \not\to b
\text{ and } b \not\to a)$, $h_X(m,n,k) = P_{m,n}(K \not\to d \text{ and } d \not\to a)$, and $h_Y(m,n,l) =
P_{m,n}(L \not\to d \text{ and } d \not\to a)$.

In addition, we have the following base cases for the formulas above:
$g_X(m,n,0) = g_Y(m,n,0) = f_X(m,n,1)$, $h_X(m,n,0) = h_Y(m,n,0) =
f_Y(m,n,1)$, and $f_X(m,n,0) = f_Y(m,n,0) = 1$.

These functions are related to the quantities estimated in the lemmas
above by $f_X(m,n,1) = P_{m,n}(a \not\to b)$, $f_Y(m,n,1) = P_{m,n}(a \not\to d)$,
$g_X(m,n,1) = P_{m,n}(c \not\to b \not\to a)$, $g_Y(m,n,1) = P_{m,n}(d \not\to b \not\to a)$, and
$h_X(m,n,1) = P_{m,n}(c \not\to d \not\to a)$.
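
As an illustration of how such recursions can be evaluated exactly, the sketch below implements only the pair $f_X$, $f_Y$, assuming the coefficients $\binom{n}{l}(2^{k}-1)^{l}/2^{kn}$ and $\binom{m-1}{k}(2^{l}-1)^{k}/2^{lm}$ (the probabilities that a $k$-set in $X$, respectively an $l$-set in $Y$, has a prescribed out-neighbourhood avoiding $a$). It is not the author's program; the brute-force checker `brute_force_no_path` is a hypothetical helper added for comparison on tiny graphs.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product
from math import comb


@lru_cache(maxsize=None)
def f_X(m, n, k):
    """P(no vertex of a k-subset K of X minus {a} reaches a) in K_{m,n}."""
    if k == 0:
        return Fraction(1)
    # l = number of vertices of Y reachable from K in one step
    return sum(Fraction(comb(n, l) * (2**k - 1) ** l, 2 ** (k * n)) * f_Y(m - k, n, l)
               for l in range(n + 1))


@lru_cache(maxsize=None)
def f_Y(m, n, l):
    """P(no vertex of an l-subset L of Y reaches a) in K_{m,n}."""
    if l == 0:
        return Fraction(1)
    # k = number of vertices of X minus {a} reachable from L in one step
    return sum(Fraction(comb(m - 1, k) * (2**l - 1) ** k, 2 ** (l * m)) * f_X(m, n - l, k)
               for k in range(m))


def brute_force_no_path(m, n, src, dst):
    """P(src does not reach dst) in K_{m,n}, enumerating all orientations."""
    edges = [(("X", i), ("Y", j)) for i in range(m) for j in range(n)]
    misses = 0
    for flip in product((False, True), repeat=len(edges)):
        out = {}
        for (u, v), f in zip(edges, flip):
            tail, head = (v, u) if f else (u, v)
            out.setdefault(tail, []).append(head)
        seen, stack = {src}, [src]
        while stack:
            w = stack.pop()
            for nxt in out.get(w, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        misses += dst not in seen
    return Fraction(misses, 2 ** len(edges))
```

For example, `f_X(2, 2, 1)` evaluates to 9/16, which agrees with the direct computation $(3/4)^2$ for $K_{2,2}$.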